Modelling pronunciation variation with single-path and multi-path syllable models: Issues to consider

Authors

  • Annika Hämäläinen
  • Louis ten Bosch
  • Lou Boves
Abstract

In this paper, we construct context-independent single-path and multi-path syllable models aimed at improved pronunciation variation modelling. We use phonetic transcriptions to define the topologies of the syllable models and to initialise the model parameters, and the Baum-Welch algorithm for the re-estimation of the model parameters. We hypothesise that the richer topology of multi-path syllable models would be better at accounting for pronunciation variation than context-dependent phone models that can only account for the effects of the left and right neighbours, or single-path syllable models whose power of modelling segmental variation would seem to be limited. However, both context-dependent phone models and single-path syllable models outperform multi-path syllable models on a large vocabulary continuous speech recognition task. Careful analyses of the errors made by the recognisers with single-path and multi-path syllable models show that the most important factors affecting the speech recognition performance are syllable context and lexical confusability. In addition, the speech recognition results suggest that the benefits of the greater acoustic modelling accuracy of the multi-path syllable models can only be reaped if the information about the syllable-level pronunciation variation can be linked with the word-level information in the language model.
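The idea of letting phonetic transcriptions define a syllable model's topology can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_multipath_topology`, the three-states-per-phone layout, the uniform initial probabilities, and the example syllable variants are all assumptions made for illustration. Each transcription variant becomes one parallel left-to-right path between a shared entry and exit state; a single variant yields a single-path model. Baum-Welch re-estimation would start from these initial values and is not shown.

```python
from typing import Dict, List, Tuple

STATES_PER_PHONE = 3  # assumption: three emitting states per phone


def build_multipath_topology(
    variants: List[List[str]],
) -> Tuple[List[str], Dict[Tuple[int, int], float]]:
    """Build state labels and initial transition probabilities.

    State 0 is a non-emitting entry state and the last state is a
    non-emitting exit state. Every (non-empty) transcription variant
    contributes one parallel left-to-right path of emitting states.
    """
    labels = ["<entry>"]
    trans: Dict[Tuple[int, int], float] = {}
    path_ends = []

    for v_idx, phones in enumerate(variants):
        first = len(labels)
        for phone in phones:
            for k in range(STATES_PER_PHONE):
                labels.append(f"path{v_idx}:{phone}:{k}")
        last = len(labels) - 1

        # Uniform initial probability of entering each path.
        trans[(0, first)] = 1.0 / len(variants)

        # Left-to-right chain: self-loop or advance, 0.5 each.
        for s in range(first, last + 1):
            trans[(s, s)] = 0.5
            if s < last:
                trans[(s, s + 1)] = 0.5
        path_ends.append(last)

    exit_state = len(labels)
    labels.append("<exit>")
    for last in path_ends:
        trans[(last, exit_state)] = 0.5

    return labels, trans


# Hypothetical syllable with a full and a reduced pronunciation variant.
labels, trans = build_multipath_topology([["d", "A", "t"], ["d", "@"]])
print(len(labels), "states,", len(trans), "transitions")
```

Passing a single variant, for example `build_multipath_topology([["d", "A", "t"]])`, gives the single-path case, with entry probability 1.0 into the only path.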

Similar resources

Construction and analysis of multiple paths in syllable models

In this paper, we construct multi-path syllable models using phonetic knowledge for initialising the parallel paths, and a data-driven solution for their re-estimation. We hypothesise that the richer topology of multi-path syllable models would be better at accounting for pronunciation variation than context-dependent phone models that can only account for the effects of left and right neighbou...

Whither Linguistic Interpretation of Acoustic Pronunciation Variation

Recent research suggests that modelling pronunciation variation is more appropriate at the syllable level than at the level of context-dependent phones. Due to the large number of factors affecting syllable pronunciation, the creation of multi-path topologies is necessary. Previous research on multi-path models in connected digit recognition has proved trajectory clustering to be an attractive...

Pronunciation variant-based multi-path HMMs for syllables

Recent research suggests that it is more appropriate to model pronunciation variation with syllable-length acoustic models than with context-dependent phones. Due to the large number of factors contributing to pronunciation variation at the syllable level, the creation of multi-path model topologies appears necessary. In this paper, we propose a novel approach for constructing multi-path models...

Multi-path Syllable Models Based on Phonetic Knowledge

Recent research suggests that syllable-length acoustic models might be more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. In this paper, we compare the recognition performance of two types of recognisers: a conventional recogniser that only uses triphones, and an experimental recogniser that employs a mix ...

Syllable-length path mixture hidden Markov models with trajectory clustering for continuous speech recognition

Recent research suggests that modeling coarticulation in speech is more appropriate at the syllable level. However, due to a number of additional factors that can affect the way syllables are articulated, creating multiple acoustic models per syllable might be necessary. Our previous research on longer-length multi-path models has proved data-driven trajectory clustering to be an attractiv...

Journal:
  • Speech Communication

Volume 51  Issue 

Pages  -

Publication date: 2009